Actions
Specifies how the crawler converts web pages into Algolia records.
The actions parameter is a list of up to 30 separate instructions in your configuration. Each action tells the crawler what information to extract from matching URLs and copy into Algolia records.
Example action
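A minimal sketch of a single action that sets only the three required parameters. The index name, URL pattern, and extracted fields are placeholder assumptions, and $ is assumed to be the Cheerio-style page selector that the crawler passes to recordExtractor.

```javascript
// Hypothetical action: index name and URL pattern are placeholders.
const blogAction = {
  indexName: "blog_posts",
  pathsToMatch: ["https://www.example.com/blog/**"],
  recordExtractor: ({ url, $ }) => {
    // Read the page title through the Cheerio-style selector.
    const title = $("head > title").text().trim();
    // Return one record per page, keyed by the page URL.
    return [{ objectID: url.href, title }];
  },
};

// The action sits in the actions list of the crawler configuration.
const config = { actions: [blogAction] };
```

Returning an array lets one page produce several records, for example one per section, if the extractor is written that way.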
For complete configurations, see the examples repository on GitHub.
Parameters
indexName
(required). Reference to the index used to store the action’s extracted records.
pathsToMatch
(required). URLs to which this action should apply.
recordExtractor
(required). Function for extracting information from a crawled page and transforming it into Algolia records for indexing.
autoGenerateObjectIDs
Whether to generate an objectID for records that don’t have one.
cache
Whether the crawler should cache crawled pages.
discoveryPatterns
Which intermediary web pages the crawler should visit.
fileTypesToMatch
File types for crawling non-HTML documents.
hostnameAliases
Key-value pairs to replace matching hostnames found in a sitemap, on a page, in canonical links, or redirects.
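The effect of a hostname alias can be illustrated with a small sketch. The helper applyHostnameAlias and the sample hostnames are hypothetical, and this shows only the idea of the replacement, not the crawler's own implementation.

```javascript
// Hypothetical mapping: URLs found on dev.example.com are treated as example.com.
const hostnameAliases = { "dev.example.com": "example.com" };

// Illustrative helper (not part of the crawler): rewrite a URL's hostname
// when it appears as a key in the alias map, otherwise leave it unchanged.
function applyHostnameAlias(rawUrl, aliases) {
  const url = new URL(rawUrl);
  if (Object.prototype.hasOwnProperty.call(aliases, url.hostname)) {
    url.hostname = aliases[url.hostname];
  }
  return url.href;
}
```

Only the hostname changes in this sketch; the path and query string are kept as they are.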
name
Unique identifier for the action. Required if schedule is set.
pathAliases
Key-value pairs to replace matching paths with new values.
selectorsToMatch
DOM selectors for nodes that must be present on the page to be processed.
schedule
How often to perform a complete crawl for this action.
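The constraints stated on this page (at most 30 actions, three required parameters, and name required when schedule is set) can be checked before a configuration is saved. The sketch below assumes a hypothetical helper, validateActions, which is not part of the crawler, and the schedule string in the usage note is an illustrative value only.

```javascript
const MAX_ACTIONS = 30;
const REQUIRED_KEYS = ["indexName", "pathsToMatch", "recordExtractor"];

// Hypothetical helper: returns a list of human-readable problems,
// or an empty list when the actions satisfy the rules described above.
function validateActions(actions) {
  const errors = [];
  if (actions.length > MAX_ACTIONS) {
    errors.push(`too many actions: ${actions.length} > ${MAX_ACTIONS}`);
  }
  actions.forEach((action, i) => {
    // Every action needs the three required parameters.
    for (const key of REQUIRED_KEYS) {
      if (action[key] === undefined) {
        errors.push(`actions[${i}]: missing ${key}`);
      }
    }
    // A scheduled action must also carry a name.
    if (action.schedule !== undefined && !action.name) {
      errors.push(`actions[${i}]: name is required when schedule is set`);
    }
  });
  return errors;
}
```

For instance, an action that sets schedule to a value such as "every 1 day" but has no name would produce a single error from this helper.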